policy group
Quantitative Representation of Scenario Difficulty for Autonomous Driving Based on Adversarial Policy Search
Yang, Shuo, Wang, Caojun, Zhang, Yuanjian, Yin, Yuming, Huang, Yanjun, Li, Shengbo Eben, Chen, Hong
Adversarial scenario generation is crucial for autonomous driving testing because it can efficiently simulate a variety of challenging and complex traffic conditions. However, it is difficult to control existing methods to generate desired scenarios, such as ones with different conflict levels. Therefore, this paper proposes a data-driven quantitative method to represent scenario difficulty. Compared with rule-based discrete scenario difficulty representation methods, the proposed algorithm achieves continuous difficulty representation. Specifically, an environment agent is introduced, and a reinforcement learning method combined with mechanism knowledge is used for policy search to obtain an agent with adversarial behavior. The model parameters of the environment agent at different stages of the training process are extracted to construct a policy group, yielding agents with different adversarial intensities, which are used to generate data for scenarios of different difficulty through the simulation environment. Finally, a data-driven quantitative scenario difficulty representation model is constructed, which outputs the environment agent policy under different difficulties. The result analysis shows that the proposed algorithm can generate reasonable and interpretable scenarios with high discrimination, and can provide quantifiable difficulty representation without any expert logic rule design. The video link is https://www.youtube.com/watch?v=GceGdqAm9Ys.
- Asia > China > Shanghai > Shanghai (0.04)
- Europe > United Kingdom > England > Leicestershire > Loughborough (0.04)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
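The checkpoint-extraction idea in the abstract above can be sketched minimally: snapshot the environment agent's parameters at several training stages, then index the resulting policy group by a normalized difficulty value. All class and field names here are hypothetical illustrations, not the paper's implementation; the assumption that later checkpoints are more adversarial is stated in the abstract.

```python
class PolicyGroup:
    """Sketch: collect environment-agent policy snapshots taken at
    different training stages, then select one by a normalized
    difficulty value in [0, 1]."""

    def __init__(self):
        self.snapshots = []  # list of (training_step, params) pairs

    def add_snapshot(self, step, params):
        self.snapshots.append((step, params))
        self.snapshots.sort(key=lambda s: s[0])  # keep in training order

    def policy_for_difficulty(self, difficulty):
        # Later training stages are assumed more adversarial, so
        # difficulty maps linearly onto the snapshot index.
        idx = int(round(difficulty * (len(self.snapshots) - 1)))
        return self.snapshots[idx][1]

group = PolicyGroup()
for step in [1000, 5000, 10000, 50000]:
    group.add_snapshot(step, {"step": step})  # stand-in for real weights

print(group.policy_for_difficulty(0.0))  # least adversarial agent
print(group.policy_for_difficulty(1.0))  # most adversarial agent
```

A continuous difficulty-representation model, as in the paper, would replace the linear index mapping with a learned function fit to data generated by these agents.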
Prioritized League Reinforcement Learning for Large-Scale Heterogeneous Multiagent Systems
Fu, Qingxu, Pu, Zhiqiang, Chen, Min, Qiu, Tenghai, Yi, Jianqiang
Large-scale heterogeneous multiagent systems feature various realistic factors of the real world, such as agents with diverse abilities and overall system cost. In comparison to homogeneous systems, heterogeneous systems offer significant practical advantages. Nonetheless, they also present challenges for multiagent reinforcement learning, including addressing the non-stationarity problem and managing an imbalanced number of agents of different types. We propose a Prioritized Heterogeneous League Reinforcement Learning (PHLRL) method to address large-scale heterogeneous cooperation problems. PHLRL maintains a record of the various policies that agents have explored during training and establishes a heterogeneous league consisting of diverse policies to aid future policy optimization. Furthermore, we design a prioritized policy gradient approach to compensate for the gap caused by differences in the number of agents of each type. Next, we use Unreal Engine to design a large-scale heterogeneous cooperation benchmark named Large-Scale Multiagent Operation (LSMO), a complex two-team competition scenario that requires collaboration between ground and airborne agents. Experiments show that PHLRL outperforms state-of-the-art methods, including QTRAN and QPLEX, on LSMO.
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > Promising Solution (0.48)
- Research Report > New Finding (0.46)
- Government > Military (0.68)
- Leisure & Entertainment > Games (0.46)
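The type-imbalance problem described above can be illustrated with a toy reweighting scheme: scale each agent type's contribution inversely to its population so that minority types (e.g. a few airborne agents among many ground agents) are not drowned out. The function name and the inverse-count rule are illustrative assumptions, not PHLRL's exact prioritized policy gradient.

```python
# Sketch of the imbalance-compensation idea: weight each agent type's
# loss contribution inversely to how many agents of that type exist,
# normalized so the weights sum to 1.

def type_weights(counts):
    """counts: dict mapping agent type -> number of agents of that type."""
    inv = {t: 1.0 / n for t, n in counts.items()}
    total = sum(inv.values())
    return {t: w / total for t, w in inv.items()}

# 90 ground agents vs. 10 airborne agents: the rarer type gets the
# larger weight, so its gradient signal is not swamped.
weights = type_weights({"ground": 90, "air": 10})
print(weights)
```

With these counts the airborne type receives weight 0.9 and the ground type 0.1, inverting the population ratio.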
Learning Heterogeneous Agent Cooperation via Multiagent League Training
Fu, Qingxu, Ai, Xiaolin, Yi, Jianqiang, Qiu, Tenghai, Yuan, Wanmai, Pu, Zhiqiang
Many multiagent systems in the real world include multiple types of agents with different abilities and functionality. Such heterogeneous multiagent systems have significant practical advantages. However, compared with homogeneous systems, they also pose challenges for multiagent reinforcement learning, such as the non-stationarity problem and the policy version iteration issue. This work proposes a general-purpose reinforcement learning algorithm named Heterogeneous League Training (HLT) to address heterogeneous multiagent problems. HLT keeps track of a pool of policies that agents have explored during training, gathering a league of heterogeneous policies to facilitate future policy optimization. Moreover, a hyper-network is introduced to increase the diversity of agent behaviors when collaborating with teammates having different levels of cooperation skill. Using heterogeneous benchmark tasks, we demonstrate that (1) HLT improves the success rate in cooperative heterogeneous tasks; (2) HLT is an effective approach to the policy version iteration problem; and (3) HLT provides a practical way to assess the difficulty of learning each role in a heterogeneous team.
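The league mechanism described above can be sketched as a growing pool of past policy versions from which teammates are sampled each episode, so the learner practices with partners of varying skill. The class, the uniform sampling rule, and the policy identifiers are all illustrative assumptions, not HLT's actual design.

```python
import random

class League:
    """Sketch: a pool of past teammate policies; each episode, the
    learner is paired with teammates drawn from this pool."""

    def __init__(self):
        self.pool = []

    def register(self, policy_id):
        # Called whenever a new policy version is worth keeping.
        self.pool.append(policy_id)

    def sample_teammates(self, k, rng=random):
        # Uniform sampling over history is the simplest choice; a real
        # system might weight by skill or recency.
        return [rng.choice(self.pool) for _ in range(k)]

league = League()
for version in range(5):
    league.register(f"policy_v{version}")

random.seed(0)
print(league.sample_teammates(3))
```

Training against such a league exposes the learner to older, weaker teammates as well as current ones, which is one way to mitigate the policy version iteration issue the abstract mentions.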
A Reinforcement Learning Approach to Estimating Long-term Treatment Effects
Tang, Ziyang, Duan, Yiheng, Zhang, Stephanie, Li, Lihong
Randomized experiments (A/B tests) are a powerful tool for estimating treatment effects, informing decision making in business, healthcare, and other applications. In many problems, the treatment has a lasting effect that evolves over time. A limitation of randomized experiments is that they do not easily extend to measuring long-term effects, since running long experiments is time-consuming and expensive. In this paper, we take a reinforcement learning (RL) approach that estimates the average reward in a Markov process. Motivated by real-world scenarios where the observed state transitions are nonstationary, we develop a new algorithm for a class of nonstationary problems, and demonstrate promising results on two synthetic datasets and one online store dataset.
- North America > Canada > Alberta (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
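The average-reward quantity targeted in the abstract above can be illustrated on a toy stationary chain: for an ergodic Markov chain with transition matrix P and per-state reward r, the long-run average reward is the stationary distribution weighted sum of rewards. The two-state chain below is made up for illustration; the paper's contribution is estimating this quantity from data, under nonstationarity, rather than from a known P.

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])   # state-transition probabilities
r = np.array([1.0, 3.0])     # per-state reward

# Stationary distribution pi satisfies pi @ P = pi, i.e. pi is the
# left eigenvector of P with eigenvalue 1.
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()           # normalize (also fixes eigenvector sign)

avg_reward = float(pi @ r)   # long-run average reward
print(avg_reward)
```

For this chain pi = (2/3, 1/3), so the long-run average reward is 2/3 · 1 + 1/3 · 3 = 5/3, regardless of the starting state.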
Apple's Photo-Scanning Plan Sparks Outcry From Policy Groups
More than 90 policy groups from the US and around the world signed an open letter urging Apple to drop its plan to have Apple devices scan photos for child sexual abuse material (CSAM). "The undersigned organizations committed to civil rights, human rights, and digital rights around the world are writing to urge Apple to abandon the plans it announced on 5 August 2021 to build surveillance capabilities into iPhones, iPads, and other Apple products," the letter to Apple CEO Tim Cook said. "Though these capabilities are intended to protect children and to reduce the spread of child sexual abuse material (CSAM), we are concerned that they will be used to censor protected speech, threaten the privacy and security of people around the world, and have disastrous consequences for many children." The Center for Democracy and Technology (CDT) announced the letter, with CDT Security and Surveillance Project codirector Sharon Bradford Franklin saying, "We can expect governments will take advantage of the surveillance capability Apple is building into iPhones, iPads, and computers."
- North America > United States (0.16)
- South America > Peru (0.05)
- South America > Paraguay (0.05)
- (24 more...)
Uncertainty Estimation For Community Standards Violation In Online Social Networks
Torabi, Narjes, Arora, Nimar S., Yu, Emma, Shah, Kinjal, Liu, Wenshun, Tingley, Michael
Online Social Networks (OSNs) provide a platform for users to share their thoughts and opinions with their community of friends or with the general public. In order to keep the platform safe for all users, as well as compliant with local laws, OSNs typically create a set of community standards organized into policy groups, and use Machine Learning (ML) models to identify and remove content that violates any of the policies. However, of the billions of content items uploaded on a daily basis, only a small fraction is so unambiguously violating that it can be removed by the automated models. Prevalence estimation is the task of estimating the fraction of violating content among the residual items by sending a small sample of these items to human labelers to obtain ground truth labels. This task is exceedingly hard because, even though we can easily get the ML scores or features for all of the billions of items, we can only get ground truth labels for a few thousand of them due to practical considerations. Indeed, the prevalence can be so low that, even after a judicious choice of items to be labeled, there can be many days in which not even a single item is labeled violating. A pragmatic choice for such low-prevalence regimes, $10^{-4}$ to $10^{-5}$, is to report the upper-bound prevalence (UBP), i.e., the $97.5\%$ confidence-interval upper bound, which takes the uncertainties of the sampling and labeling processes into account and gives a smoothed estimate. In this work we present two novel techniques, Bucketed-Beta-Binomial and Bucketed-Gaussian-Process, for this UBP task, and demonstrate on real and simulated data that they have much better coverage than the commonly used bootstrapping technique.
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- (2 more...)
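The "many days with zero violating labels" situation described above has a textbook baseline worth seeing concretely: with zero violating items among n labeled samples, the exact one-sided 97.5% Clopper-Pearson upper bound on prevalence has the closed form 1 - 0.025^(1/n). This is a standard binomial bound used here for illustration, not the paper's Bucketed-Beta-Binomial or Bucketed-Gaussian-Process estimator.

```python
def ubp_zero_events(n, confidence=0.975):
    """Upper bound on prevalence given 0 violating labels in n samples.

    Solves (1 - p)^n = 1 - confidence for p: the largest prevalence
    that would still produce zero violations in n draws with
    probability at least (1 - confidence).
    """
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n)

for n in (1000, 10000):
    print(n, ubp_zero_events(n))
```

With 1,000 labels the bound is roughly 3.7 × 10^-3, and with 10,000 labels roughly 3.7 × 10^-4: even sizable labeling budgets leave the upper bound orders of magnitude above a true prevalence of 10^-5, which motivates the pooling across buckets that the paper's methods perform.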
Centre forms policy group to study artificial intelligence: Nasscom - Times of India
MUMBAI: Amid a raging global debate on the consequences of artificial intelligence (AI), India has formed a "policy group" to study the new technologies and recommend a framework for their adoption, IT industry body Nasscom said today. "We all are currently working out on a policy framework on AI," its vice president K S Viswanathan told PTI, when asked about concerns over AI or the intelligence exhibited by machines. He said a "policy group" has been created by the Ministry of Electronics and Information Technology with representation from academia, which has done a lot of research on the subject, and from Nasscom for the industry's perspective. The group will focus on aspects like skilling the workforce, privacy, security and fixing responsibility if anything goes wrong, Viswanathan said. "We have to create a thought leadership on what is this programme all about, what is the likely impact. Create a thought leadership when AI becomes a reality, what are the elements and sub-elements which need to be taken care of, how do we take care of that," he said.